DBIG-US: A two-stage under-sampling algorithm to face the class imbalance problem


Abstract

The class imbalance problem occurs when one class far outnumbers the other classes, causing most traditional classifiers to perform poorly on the minority classes. To tackle this problem, a plethora of techniques have been proposed, especially centered around resampling methods. This paper introduces a two-stage method that combines the DBSCAN clustering algorithm, used to filter noisy majority-class instances, with a graph-based procedure to overcome the class imbalance. We then experimentally evaluate the behavior of the proposed method on a collection of two-class imbalanced data sets. The experimental results show an improvement in classification performance, measured by the geometric mean of the accuracy on each class, and also a higher reduction ratio compared with several state-of-the-art under-sampling techniques.
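The evaluation measure named in the abstract, the geometric mean of the per-class accuracies, can be illustrated with a minimal sketch. The function name `geometric_mean_accuracy` and the toy labels are assumptions made for illustration; they are not taken from the paper. For a two-class problem the measure is the square root of the product of the true positive rate and the true negative rate.

```python
from math import sqrt


def geometric_mean_accuracy(y_true, y_pred, positive=1):
    """Two-class G-mean: sqrt(TPR * TNR). Hypothetical helper, not the paper's code."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    tpr = tp / (tp + fn)  # accuracy on the positive (minority) class
    tnr = tn / (tn + fp)  # accuracy on the negative (majority) class
    return sqrt(tpr * tnr)


# Toy data: 2 minority instances (label 1) and 8 majority instances (label 0).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_majority = [0] * 10                     # plain accuracy 0.8, but TPR is 0
mixed = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]         # TPR = 0.5, TNR = 0.875

g_trivial = geometric_mean_accuracy(y_true, always_majority)
g_mixed = geometric_mean_accuracy(y_true, mixed)
```

A classifier that always predicts the majority class reaches 80% plain accuracy on this toy data yet scores a G-mean of zero, which is why the measure suits imbalanced problems.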


Similar resources

A Multiple Expert Approach to the Class Imbalance Problem Using Inverse Random under Sampling

In this paper, a novel inverse random under sampling (IRUS) method is proposed for class imbalance problem. The main idea is to severely under sample the negative class (majority class), thus creating a large number of distinct negative training sets. For each training set we then find a linear discriminant which separates the positive class from the negative class. By combining the multiple de...


Semi Supervised Under-sampling: a Solution to the Class Imbalance Problem for Classification and Feature Selection

Most medical datasets are not balanced in their class labels. Furthermore, in some cases it has been noticed that the given class labels do not accurately represent characteristics of the data record. Most existing classification methods tend not to perform well on minority class examples when the dataset is extremely imbalanced. This is because they aim to optimize the overall accuracy without...


The algorithm for solving the inverse numerical range problem

The numerical range of a square matrix A is denoted by W(A) and defined as W(A) = {x*Ax : x ∈ S1}, where S1 is the unit sphere. In 2009, Russell Carden posed the inverse numerical range problem as follows: for a point z ∈ W(A), find a vector x ∈ S1 such that z = x*Ax. In this thesis, we present an algorithm for solving the inverse numerical range problem.


Adaptive Ensemble Selection for Face Re-identification under Class Imbalance

Systems for face re-identification over a network of video surveillance cameras are designed with a limited amount of reference data, and may operate under complex environments. Furthermore, target individuals provide a small proportion of the facial captures for design and during operations, and these proportions may change over time according to operational conditions. Given a diversified poo...


Author identification: Using text sampling to handle the class imbalance problem

Authorship analysis of electronic texts assists digital forensics and anti-terror investigation. Author identification can be seen as a single-label multi-class text categorization problem. Very often, there are extremely few training texts at least for some of the candidate authors or there is a significant variation in the text-length among the available training texts of the candidate author...



Journal

Journal title: Expert Systems With Applications

Year: 2021

ISSN: 1873-6793, 0957-4174

DOI: https://doi.org/10.1016/j.eswa.2020.114301